A Review on Speech Corpus Development for Automatic Speech Recognition in Indian Languages

Author

  • Cini Kurian
Abstract

Cini Kurian, Department of Computer Science, Al-Ameen College, Edathala, Aluva, Kerala

Corpus development has gained much attention owing to the recent rise of statistics-based natural language processing. It has new applications in language technology, linguistic research, language education and information exchange. Corpus-based language research offers an innovative outlook that will set aside older linguistic theories. A speech corpus is an essential resource for building a speech recognizer. One of the main challenges faced by speech scientists is the unavailability of these resources. Compared with English, far fewer efforts have been made to make such resources publicly available for Indian languages. In this paper we review the efforts made in Indian languages to develop speech corpora for automatic speech recognition.

Similar resources

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

Allophone-based acoustic modeling for Persian phoneme recognition

Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighbor phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...

A Survey on Speech Recognition in Indian Languages

Speech recognition applications are becoming common and useful in this day and age, as many modern devices are designed and produced to be user-friendly for the convenience of the general public. Speaking or communicating directly with the machine to achieve desired objectives makes the use of modern devices easier and more convenient. Although many interactive software applications are available, the uses of...

Automatic extraction of phonetically rich sentences from large text corpus of Indian languages

A set of phonetically rich sentences is a requirement for representing different speech units, to be used for developing Automatic Speech Recognition and Speech Synthesis Systems. Selecting such a set from a large text corpus without modifying the characteristics of the corpus is still a difficult task. A major concern in this process is to decide on what basis sentences must be chosen so that ...

Collecting and Evaluating Speech Recognition Corpora for Nine Southern Bantu Languages

We describe the Lwazi corpus for automatic speech recognition (ASR), a new telephone speech corpus which includes data from nine Southern Bantu languages. Because of practical constraints, the amount of speech per language is relatively small compared to major corpora in world languages, and we report on our investigation of the stability of the ASR models derived from the corpus. We also repor...

Publication date: 2015